Recording device, recording method, reproducing device, reproducing method, and recording/reproducing device

ABSTRACT

It is possible for the viewer to readily and accurately reach a desired image/audio reproduction start position in reproduction. 
A time code is added to moving image data, obtained by imaging a state in which a person who writes a description is explaining while writing a description in a description portion, and to audio data corresponding to the moving image data, and the data is recorded in a recording unit. The moving image data is processed to determine a written portion written in the description portion, index image data is generated to display each portion determined as the written portion as an index description, and the index image data is recorded in the recording unit. To the index image data, a value of the time code corresponding to description time is added as a timestamp, in association with each pixel constituting the index description.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 16/482,005, filed on Jul. 30, 2019, and is based upon and claims the benefit of priority to International Application No. PCT/JP2018/007908, filed on Mar. 1, 2018 and from the prior Japanese Patent Application No. 2017-048188 filed on Mar. 14, 2017. The entire contents of each of these documents are incorporated herein by reference.

TECHNICAL FIELD

The present technology relates to a recording device, a recording method, a reproducing device, a reproducing method, and a recording/reproducing device, and more particularly, to a recording device and the like for recording moving image data and audio data of a state in which a person who writes a description is explaining while writing a description in a description portion.

BACKGROUND ART

It is known to generate an index entry in conventional recording of image/sound of a state in which a person who writes a description is explaining while writing a description in a description portion. For example, in Patent Document 1, a method of generating an index entry is proposed, in which the index entry is generated when the writing of the person who writes a description is interrupted for a predetermined time. However, the index entry is not generated on the basis of the content of the writing by this method, and the index entry is not generated unless the writing is interrupted. Therefore, in reproduction, it is difficult for a viewer to reach a desired reproduction start position referring to the index entry.

CITATION LIST

Patent Document

- Patent Document 1: Japanese Patent Application Laid-Open No. 2002-101398

SUMMARY OF THE INVENTION

Problems to be Solved by the Invention

An object of the present technology is to make it possible for a viewer to readily and accurately reach a desired image/audio reproduction start position in reproduction.

Solutions to Problems

A concept of the present technology is to provide

a recording device including:

an image/audio recording unit for adding a time code to moving image data obtained by imaging a state in which a person who writes a description is explaining while writing a description in a description portion, and audio data corresponding to the moving image data to record the data in a recording unit; and

an index-image recording unit for processing the moving image data, determining a written portion in the description portion, generating index image data for displaying, as an index description, portions determined as the written portion, and recording the index image data in a recording unit,

in which, to the index image data, a value of the time code corresponding to description time is added as a timestamp, in association with pixels constituting the index description.

In the present technology, the image/audio recording unit adds the time code to the moving image data, obtained by imaging a state in which the person who writes a description is explaining while writing a description in the description portion, and to the audio data corresponding to the moving image data, and records the data in the recording unit.

The index-image recording unit processes the moving image data to determine the written portion written in the description portion. For example, the index-image recording unit may be configured to extract a difference value between a current frame image and a reference frame image for each pixel, grasp groups of a series of pixels having a difference value equal to or more than a threshold value, and determine whether each group has the written portion or not.

In this case, for example, the index-image recording unit may be configured to determine whether each group has the written portion or not by using a shape evaluation value, an extraction time evaluation value, and a stationary time evaluation value. The index-image recording unit may be configured to determine the shape evaluation value on the basis of aspect ratios of a minimum square containing the group and a maximum square contained in the group, obtain the extraction time evaluation value on the basis of the value of a difference from the reference frame image, in a pixel included in the group, over a current frame and a plurality of frames prior to the current frame, and obtain the stationary time evaluation value on the basis of the value of a difference from a frame image of a preceding frame, in a pixel included in the group, over a current frame and a plurality of frames prior to the current frame.

The index-image recording unit generates index image data for displaying, as the index description, each portion determined as the written portion and records the index image data in the recording unit. This recording unit may be the same as or different from the recording unit for recording the moving image data and the audio data. Here, to the index image data, a value of the time code corresponding to description time is added as a timestamp, in association with each pixel constituting the index description.

For example, when an erased state of a written portion in the description portion is recognized more than a certain number of times for a fixed time period, the index-image recording unit may determine the index image data having been generated up to that point as one item of index image data and proceed to a process of generating new index image data. Thus, each time a written portion in the description portion is largely erased by the person who writes a description, new index image data is generated.

As described above, in the present technology, moving image data is processed to determine portions as written portions, index image data for displaying the written portions as index descriptions is generated and recorded in the recording unit, and, to the index image data, the values of time codes corresponding to description time are added as timestamps, in association with pixels constituting the index descriptions. Thus, upon reproduction, the index image data is used to display an index image having an index description, and a viewer is able to specify an appropriate position on the basis of the content of an index description to readily and accurately reach an image/audio reproduction start position corresponding to description time corresponding to the specified position.

Note that, in the present technology, for example, the recording device may further include a character recognition unit to perform character recognition on an index description to recognize a word, in which the index image data may include data on this word in association with each pixel constituting the index description corresponding to the recognized word. Thus, upon reproduction, it is possible, for example, to obtain and display related information by using the data on the word recognized by the character recognition.

Furthermore, a concept of the present technology is to provide

a reproducing device including:

an index-image display control unit for reading index image data from a recording unit to display, on an index-image display unit, an index image having an index description, on the basis of the index image data, the index image data including a value of a time code, added as a timestamp, corresponding to description time in association with each pixel constituting the index description;

a user operation unit for specifying an appropriate position of an index description of an index image displayed on the display unit; and

a reproduction control unit for controlling a reproduction start position in moving image data and audio data recorded in a recording unit on the basis of the timestamp added to the index image data in association with a pixel at the specified position.

In the present technology, the index-image display control unit reads index image data from the recording unit, and, on the basis of the index image data, an index image having an index description is displayed on the index-image display unit. Here, to the index image data, a value of the time code corresponding to description time is added as a timestamp, in association with each pixel constituting the index description. Moving image data and audio data are recorded with a time code, in the recording unit.

The user operation unit specifies an appropriate position of the index description of the index image displayed on the display unit. On the basis of a timestamp added to the index image data in association with a pixel at the specified position, the reproduction control unit controls a reproduction start position in the moving image data and audio data recorded in the recording unit.

For example, the reproduction control unit may be configured to display, as a still image, an image at the reproduction start position on a reproduced-image display unit, before starting reproduction. In this case, for example, the reproduction control unit may be configured to, when receiving a reproduction instruction from the user operation unit while an image at a reproduction start position is displayed as a still image on the reproduced-image display unit, start reproduction of the moving image data and the audio data, from the reproduction start position. In this case, the viewer is able to issue a reproduction instruction after confirming the image at the reproduction start position.

Furthermore, the reproduction control unit may be configured to display, on an image list display unit, thumbnail images of frames within a predetermined range including the reproduction start position while an image at a reproduction start position is displayed as a still image on the reproduced-image display unit. In this case, for example, when a predetermined thumbnail image is selected from the user operation unit, the reproduction control unit may change the reproduction start position to a position of a frame of this thumbnail image. In this case, the viewer can easily and appropriately change the reproduction start position with reference to the thumbnail images.

As described above, in the present technology, the index image having an index description is displayed on the basis of the index image data. Specifying an appropriate position in the index description controls the reproduction start position in the moving image data and the audio data recorded in the recording unit on the basis of the timestamp added to the index image data in association with a pixel at the specified position. Therefore, the viewer can specify an appropriate position on the basis of the content of an index description to readily and accurately reach an image/audio reproduction start position corresponding to description time corresponding to the specified position.

Note that, in the present technology, for example, in the reproducing device, in association with each pixel constituting an index description corresponding to a word recognized using character recognition, data on the word is added to the index image data, and the reproducing device further includes an information acquisition unit for acquiring, in association with a pixel corresponding to a position designated by the user operation unit, on the basis of data on a word added to the index image data, information relating to the word, in which the reproduction control unit displays the acquired information over the image at the reproduction start position.

Furthermore, in the present technology, for example, in the reproducing device, in association with each pixel constituting an index description corresponding to a word recognized using character recognition, data on the word is added to the index image data, and the reproduction control unit may be configured to display, in an image list display unit, while an image at a reproduction start position is displayed as a still image on the image display unit, together with a thumbnail image of a frame corresponding to the reproduction start position, a thumbnail image at a position of a timestamp added in association with a pixel corresponding to another position to which data on the same word as the word added in association with the pixel at the specified position is added. Then, in this case, when a predetermined thumbnail image is selected from the user operation unit, the reproduction control unit may change the reproduction start position to a position of a frame of this thumbnail image.

Effects of the Invention

According to the present technology, it is possible for the viewer to readily and accurately reach a desired image/audio reproduction start position in reproduction. It is noted that the effects described herein are by way of example only and are not limiting, and there may be further additional effects.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram illustrating a configuration example of a recording/reproducing system according to an embodiment.

FIG. 2 is an exemplary view of an image of a certain frame captured by a camera.

FIG. 3 is an exemplary view of an index image based on index image data.

FIG. 4 is a diagram illustrating time codes corresponding to description time, having values added (embedded) as timestamps in association with pixels constituting an index description.

FIG. 5 is an exemplary view of a display screen upon reproduction.

FIG. 6 is a block diagram illustrating a configuration example of a recording/reproducing device.

FIG. 7 is a flowchart illustrating a process in an index image generation unit.

FIG. 8 is a diagram illustrating how to obtain the values in shape evaluation.

FIG. 9 is a block diagram illustrating a configuration example of a recording/reproducing device where character recognition is performed on an index description included in an index image.

FIG. 10 is an explanatory view of reference information displayed on the basis of data on a word added to index image data.

MODE FOR CARRYING OUT THE INVENTION

Modes for carrying out the invention (hereinafter, referred to as embodiments) will be described below. Note that the description will be given in the following order.

1. Embodiments

2. Modifications

1. EMBODIMENTS

[Configuration Example of Recording/Reproducing System]

FIG. 1 illustrates a configuration example of a recording/reproducing system 10 according to an embodiment. The recording/reproducing system 10 includes a camera 101 and a recording/reproducing device 102.

The camera 101 captures images of a state in which a person 201 who writes a description is speaking (giving a supplement or an explanation) while writing a description 203 in a description portion 202 to obtain moving image data and audio data corresponding to the moving image data. In this case, the camera 101 is fixed to have an angle of view in which the person 201 who writes a description and the description portion 202 can be imaged and is located at a position at which the speech of the person 201 who writes a description is collected.

FIG. 2 illustrates an example of an image of a certain frame captured by the camera 101. In this captured image, the description 203 is positioned on the upper side of the description portion 202, and the person 201 who writes a description is positioned in front of the description portion 202. When the person 201 who writes a description moves or gestures, a portion of the description 203 that is behind the person 201 in this frame can also be imaged by the camera 101 in a subsequent frame.

The person 201 who writes a description includes, for example, a person who gives a lecture, such as a professor who gives a lecture to students in a university lecture room, a company member who gives a presentation to other members in a meeting room of the company, and the like. The description portion 202 is a writable portion, such as a whiteboard or a blackboard, and may be a writable wall portion in a lecture room or a meeting room.

The recording/reproducing device 102 performs the following processing during recording. In other words, the recording/reproducing device 102 adds a time code to moving image data and audio data which are obtained by using the camera 101 and records the data in a recording unit. Furthermore, the recording/reproducing device 102 processes moving image data obtained by using the camera 101 to determine a written portion, generates index image data for displaying, as an index description, each portion determined as the written portion, and records the index image data in a recording unit. Note that the recording unit for recording the moving image data and the audio data and the recording unit for recording the index image data may be the same or different and include, for example, a semiconductor memory.

The recording/reproducing device 102 extracts a difference value between a current frame image and a reference frame image for each pixel, grasps groups of a series of pixels having a difference value equal to or more than a threshold value, and determines whether each group has the written portion or not. Specifically, the recording/reproducing device 102 determines whether each group has the written portion or not by using a shape evaluation value, an extraction time evaluation value, and a stationary time evaluation value.

The recording/reproducing device 102 obtains the shape evaluation value from a relationship between the aspect ratios of a minimum square containing the group and a maximum square contained in the group. Furthermore, the recording/reproducing device 102 obtains the extraction time evaluation value on the basis of the value of a difference from the reference frame image, in a pixel included in the group, over a current frame and a plurality of frames prior to the current frame. Furthermore, the recording/reproducing device 102 obtains the stationary time evaluation value on the basis of the value of a difference from a frame image of a preceding frame, in a pixel included in the group, over a current frame and a plurality of frames prior to the current frame.

When an erased state of the portion of a description 203 in the description portion 202 is recognized more than a certain number of times for a fixed time period, the recording/reproducing device 102 determines the index image data having been generated up to that point as one item of index image data and proceeds to a process of generating new index image data. Thus, the recording/reproducing device 102 generates new index image data each time the portion of the description 203 written in the description portion 202 is completely erased by the person 201 who writes a description. Therefore, at the end of recording, in addition to the moving image data and the audio data, one item of index image data or a plurality of items of index image data (index image data group) is recorded in the recording unit.
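The switch to new index image data can be sketched as a counter over a sliding window of frames. This is only a minimal illustration: the class name `IndexImageSplitter`, the parameters `window_frames` and `min_erase_count`, and the use of integer frame numbers as time are all assumptions made for the example, and how a single frame is judged to show an "erased state" is left outside the sketch.

```python
from collections import deque


class IndexImageSplitter:
    """Hypothetical helper: decides when to close the current index image.

    The window length and the count are illustrative values, not taken
    from the specification.
    """

    def __init__(self, window_frames=90, min_erase_count=3):
        self.window_frames = window_frames
        self.min_erase_count = min_erase_count
        self.erase_frames = deque()  # frame numbers at which erasure was recognized

    def observe(self, frame_number, erased):
        """Record one frame; return True when new index image data should start."""
        if erased:
            self.erase_frames.append(frame_number)
        # Forget recognitions that fell out of the fixed time period.
        while self.erase_frames and self.erase_frames[0] <= frame_number - self.window_frames:
            self.erase_frames.popleft()
        # "More than a certain number of times" is modeled as a strict comparison.
        if len(self.erase_frames) > self.min_erase_count:
            self.erase_frames.clear()
            return True
        return False
```

With `window_frames=10` and `min_erase_count=2`, three erased-state recognitions in consecutive frames trigger a switch, whereas three recognitions spaced twenty frames apart do not.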

To the index image data, a value of the time code corresponding to the time at which a description is written is added as a timestamp, in association with each pixel constituting the index description. FIG. 3 illustrates an example of an image based on the index image data, that is, an index image. This index image includes an index description 203A.

FIG. 4 illustrates time codes corresponding to description time, having values added (embedded) as timestamps in association with pixels constituting the index description. In the illustrated example, a timestamp “t0” corresponding to the time at which each pixel has been described is associated with each pixel constituting the top of the character “A”. Furthermore, in the illustrated example, a timestamp “t13” corresponding to the time at which each pixel has been described is associated with each pixel constituting the bottom of the character “B”.

Furthermore, in reproduction, the recording/reproducing device 102 performs the following processing. In other words, the recording/reproducing device 102 reads index image data from the recording unit and displays an index image having an index description, on an index-image display unit, on the basis of the index image data. In this state, when the viewer operates the user operation unit to specify the position of a desired description content of the index description in the index image, the recording/reproducing device 102 controls a reproduction start position in the moving image data and the audio data recorded in the recording unit, on the basis of the timestamp added to the index image data in association with a pixel corresponding to the specified position.
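The lookup from a specified position to a reproduction start position can be sketched as follows. This is a minimal illustration under assumptions: the index image data is modeled as a two-dimensional list `index_timestamps` holding a time code value (or `None` for pixels that are not part of an index description), and the small `search_radius` fallback to the nearest timestamped pixel is a convenience added for the example, not something stated in the specification.

```python
def find_start_time_code(index_timestamps, x, y, search_radius=2):
    """Return the timestamp of the timestamped pixel nearest to (x, y), or None.

    index_timestamps[row][col] is a time code value or None (hypothetical layout).
    search_radius=0 looks only at the specified pixel itself.
    """
    height = len(index_timestamps)
    width = len(index_timestamps[0])
    best = None  # (squared distance, timestamp)
    for ny in range(max(0, y - search_radius), min(height, y + search_radius + 1)):
        for nx in range(max(0, x - search_radius), min(width, x + search_radius + 1)):
            timestamp = index_timestamps[ny][nx]
            if timestamp is None:
                continue
            distance = (nx - x) ** 2 + (ny - y) ** 2
            if best is None or distance < best[0]:
                best = (distance, timestamp)
    return None if best is None else best[1]
```

The returned value would then be used to seek the recorded moving image data and audio data to the frame carrying the same time code.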

In this case, the recording/reproducing device 102 displays, as a still image, an image at the reproduction start position on a reproduced-image display unit, before starting reproduction. In this state, when the viewer operates the user operation unit to instruct reproduction, the recording/reproducing device 102 starts reproduction of the moving image data and the audio data from the reproduction start position, and starts image display and audio output.

In addition, the recording/reproducing device 102 displays thumbnail images of frames within a predetermined range including the reproduction start position, on an image list display unit, while the image at the reproduction start position is displayed, as a still image, on the image display unit. This predetermined range is defined as, for example, only a range over frames prior to the reproduction start position or a range over frames prior to and subsequent to the reproduction start position. In this state, when the viewer operates the user operation unit to select a desired thumbnail image, the recording/reproducing device 102 changes the reproduction start position to a frame position of the selected thumbnail image.

FIG. 5 illustrates an example of a display screen upon reproduction. This display screen includes an index-image display unit 301, an image list display unit 302, and a reproduced-image display unit 303. As described above, the index-image display unit 301 displays an index image based on index image data. In the illustrated example, there are three index images identified by “Index 00”, “Index 01”, and “Index 02”, and an index image of “Index 00” selected by the viewer is displayed on the front side.

Furthermore, the illustrated example shows that the viewer operates the user operation unit to specify a position (two-dimensional coordinate position) of an index description in the index image of “Index 00”, indicated by a mouse pointer 304. In the reproduced-image display unit 303, a frame image having a time code value indicated by a timestamp added to the index image data in association with a pixel corresponding to the specified position is displayed, as a still image, for an image at the reproduction start position.

In the image list display unit 302, thumbnail images of frames within a predetermined range including the reproduction start position are displayed. In the illustrated example, thumbnail images within the range over frames prior to and subsequent to the reproduction start position are displayed in time series. In this state, when the viewer selects a desired thumbnail image, the reproduction start position is changed to a frame position of the selected thumbnail image, and the still image displayed in the reproduced-image display unit 303 and the thumbnail images displayed in the image list display unit 302 are changed with the change of the reproduction start position.

Note that, in the illustrated example, the thumbnail images over four frames, that is, two frames prior to the frame at the reproduction start position and one frame subsequent thereto, are displayed, but the number of frames to be displayed is not limited thereto. Furthermore, frames do not need to be displayed at an interval of one frame and may be displayed at an interval of a predetermined number of frames. Furthermore, this frame interval may be appropriately adjustable by the user.
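The selection of frames for the thumbnail list can be sketched as below. The function name `thumbnail_frames` and its parameters are hypothetical; frame positions are modeled as integer frame numbers, and the defaults mirror the illustrated example (two frames before, one frame after, interval of one frame).

```python
def thumbnail_frames(start_frame, before=2, after=1, interval=1, total_frames=None):
    """List frame numbers to show as thumbnails around the reproduction start position.

    Frames that would fall before frame 0, or at or beyond total_frames
    when it is given, are dropped.
    """
    candidates = [start_frame + k * interval for k in range(-before, after + 1)]
    return [
        frame
        for frame in candidates
        if frame >= 0 and (total_frames is None or frame < total_frames)
    ]
```

For example, `thumbnail_frames(100)` yields frames 98, 99, 100, and 101, and `thumbnail_frames(100, interval=30)` yields frames 40, 70, 100, and 130. Selecting one of these thumbnails would simply replace `start_frame` with the selected frame number and rebuild the list.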

When, while the frame image at the reproduction start position is displayed as a still image in the reproduced-image display unit 303, the viewer operates the user operation unit to instruct reproduction, reproduction of the moving image data and the audio data is started from the reproduction start position, and thus, the reproduced-image display unit 303 sequentially displays the reproduced images from the reproduction start position. In addition, in synchronization with the display of the reproduced images, reproduction sound is also output from an audio output unit, which is not illustrated.

[Configuration of Recording/Reproducing Device]

FIG. 6 illustrates a configuration example of the recording/reproducing device 102. The recording/reproducing device 102 includes a control unit 121, a user operation unit 122, and a communication interface (communication I/F) 123. The recording/reproducing device 102 further includes a data input unit 131, an encoding unit 132, a recording/reproducing unit 133, an index image generation unit 134, a decoding unit 135, a display processing unit 136, an image display unit 137, and an audio output unit 138.

The control unit 121 controls the operation of each unit of the recording/reproducing device 102. The user operation unit 122 constitutes a user interface and includes a key, a touch panel, a remote controller, or the like for the user (the viewer in the case of reproduction) to perform various operations. The communication interface 123 communicates with a server located on a network, such as the Internet, under the control of the control unit 121, and acquires, for example, desired information from this server.

The data input unit 131 is a portion for inputting moving image data and audio data obtained by using the camera 101. For example, the data input unit 131 may be a cable connection unit connecting the camera 101. Furthermore, for example, when the data input unit 131 is to be connected to the camera 101 with communication, the data input unit 131 may have a wireless or wired communication interface.

The encoding unit 132 encodes moving image data and audio data input to the data input unit 131. During recording, the recording/reproducing unit 133 records the encoded image and audio data obtained by the encoding unit 132, in a recording medium such as a semiconductor memory, not illustrated, with a time code TC added thereto.

The index image generation unit 134 processes the moving image data input to the data input unit 131, determines the portion of the description 203 written in the description portion 202, and generates index image data for displaying, as an index description, portions determined as the written portion. In this case, a value of the time code TC corresponding to description time is added to the index image data, in association with each pixel constituting the index description. The index image generation unit 134 performs, for each frame, the process from the start to the end of the flowchart of FIG. 7 by software, for example, on the basis of a program.

First, in step ST1, the index image generation unit 134 extracts a difference in each pixel between an image of a certain frame and a reference frame image (initially, a frame image in a state before writing a description in the description portion 202), extracts, from a result of difference extraction, a pixel having a large difference in pixel value, that is, a pixel having been changed from a state of the reference frame image, and grasps a group (a predetermined number of pixels) of a series of such pixels.
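Step ST1 can be sketched as per-pixel differencing followed by grouping of connected changed pixels. The sketch below is illustrative only: frames are modeled as two-dimensional lists of grayscale values, and the function name `extract_groups`, the threshold of 30, the use of 4-connectivity, and the `min_pixels` cut-off are assumptions that the specification does not fix.

```python
from collections import deque


def extract_groups(frame, reference, diff_threshold=30, min_pixels=1):
    """Group 4-connected pixels whose difference from the reference is large.

    Returns a list of groups; each group is a list of (row, col) positions.
    """
    height, width = len(frame), len(frame[0])
    changed = [
        [abs(frame[y][x] - reference[y][x]) >= diff_threshold for x in range(width)]
        for y in range(height)
    ]
    seen = [[False] * width for _ in range(height)]
    groups = []
    for y in range(height):
        for x in range(width):
            if not changed[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill collects one connected group.
            queue = deque([(y, x)])
            seen[y][x] = True
            group = []
            while queue:
                cy, cx = queue.popleft()
                group.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width and changed[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(group) >= min_pixels:
                groups.append(group)
    return groups
```

Each returned group is a candidate only; at this stage it may correspond to the description 203, to the person 201, or to something else that changed relative to the reference frame image.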

As described above, the group grasped in step ST1 includes the portion of the description 203, but also includes the portion of the person 201 who writes a description or the like. In the following steps, it is determined whether each group has the portion of the description 203 or not. For this determination, values of shape evaluation, extraction time evaluation, and stationary time evaluation are used.

Next, in step ST2, the index image generation unit 134 obtains a value α of the shape evaluation. The index image generation unit 134 obtains this value α, on the basis of a relationship between aspect ratios of a minimum square (outer square) containing a group and the maximum square (inner square) contained in the group. FIG. 8(a) illustrates an example of a group corresponding to the portion of the description 203, and a square indicated by an arrow Po represents an outer square, and a square indicated by an arrow Pi represents an inner square. Furthermore, FIG. 8(b) illustrates an example of a group corresponding to the portion of the person 201 who writes a description, and a square indicated by an arrow Qo represents an outer square, and a square indicated by an arrow Qi represents an inner square.

In a case of a group corresponding to the portion of the description 203, the inner square (Pi) is smaller than the outer square (Po), and the value α of the shape evaluation becomes larger. On the other hand, a group corresponding to the portion of the person 201 who writes a description is larger than the group corresponding to the portion of the description 203, and the inner square (Qi) also has a certain size compared to the outer square (Qo). Therefore, the value α of the shape evaluation becomes smaller.
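One way to obtain a value with the behavior described for α is sketched below. The exact formula is not given in the text, so this is a stand-in under stated assumptions: the outer square is approximated by the axis-aligned bounding rectangle of the group, the inner square is the largest axis-aligned square fully covered by group pixels, and α is taken as one minus the ratio of their areas. The function name `shape_evaluation` is hypothetical.

```python
def shape_evaluation(group):
    """Return an illustrative shape evaluation value in the range [0, 1).

    group is a list of (row, col) pixel positions. Thin strokes give values
    near 1; solid blobs give values near 0. The formula is an assumption.
    """
    pixels = set(group)
    rows = [y for y, _ in group]
    cols = [x for _, x in group]
    top, left = min(rows), min(cols)
    outer_height = max(rows) - top + 1
    outer_width = max(cols) - left + 1

    # Dynamic programming: side[(y, x)] is the side length of the largest
    # all-group square whose bottom-right corner is at (y, x).
    side = {}
    inner_side = 0
    for y in range(top, top + outer_height):
        for x in range(left, left + outer_width):
            if (y, x) in pixels:
                s = 1 + min(
                    side.get((y - 1, x), 0),
                    side.get((y, x - 1), 0),
                    side.get((y - 1, x - 1), 0),
                )
                side[(y, x)] = s
                inner_side = max(inner_side, s)

    return 1.0 - (inner_side * inner_side) / (outer_height * outer_width)
```

A one-pixel-wide stroke ten pixels long gives a value close to 1, while a filled 4 by 4 block gives 0, which matches the tendency described for the groups of FIG. 8(a) and FIG. 8(b).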

Next, in step ST3, the index image generation unit 134 obtains a value β of the extraction time evaluation. The index image generation unit 134 obtains this value β on the basis of the value of a difference from the reference frame image, in a pixel included in the group, over a current frame and a plurality of frames prior to the current frame, and as the number of frames having a large value of a difference increases, the value β becomes larger.

For example, the index image generation unit 134 determines whether a frame (difference extraction frame) has a large difference from the reference frame image, for pixels (all pixels or some pixels) included in the group, accumulates the results of the determination, and defines the number of continuous difference extraction frames (for example, an average value, if this evaluation is performed on a plurality of pixels) as an evaluation value β.

In the case of the portion of the description 203, a large difference continues to be extracted after writing, and the evaluation value β becomes larger. On the other hand, in the case of a moving portion such as the person 201 who writes a description, when the person 201 who writes a description has passed, extraction of a large difference is interrupted, and the evaluation value β becomes smaller. However, when the person 201 who writes a description stays without moving, the evaluation value β may become larger even though the group is the portion of the person 201 who writes a description.
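The counting of continuous difference extraction frames can be sketched as follows. The frame representation (two-dimensional lists of grayscale values, ordered oldest to newest in `history`), the threshold, and the function name `extraction_time_evaluation` are assumptions for illustration; the averaging over pixels follows the example given in the text.

```python
def extraction_time_evaluation(history, reference, group, diff_threshold=30):
    """Average, over the group's pixels, of the run of consecutive frames
    (ending at the current frame) that differ strongly from the reference.

    history: list of frames, oldest first; the last entry is the current frame.
    """
    runs = []
    for y, x in group:
        run = 0
        for frame in reversed(history):
            if abs(frame[y][x] - reference[y][x]) >= diff_threshold:
                run += 1
            else:
                break  # the run of difference extraction frames is interrupted
        runs.append(run)
    return sum(runs) / len(runs)
```

A pixel that was written on and stays written accumulates a long run, while a pixel that the person merely passed over drops back to a run of zero once the person has moved on.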

Next, in step ST4, the index image generation unit 134 obtains a value γ of the stationary time evaluation. The index image generation unit 134 obtains this value γ on the basis of the value of a difference from a frame image of a preceding frame, in a pixel included in the group, over a current frame and a plurality of frames prior to the current frame, and as the number of frames having a large value of a difference increases, the value γ becomes smaller.

For example, the index image generation unit 134 accumulates a pixeldifference from a preceding frame for pixels (all pixels or some pixels)included in the group and determines the reciprocal of the value of thepixel difference (for example, an average value, if this evaluation isperformed on a plurality of pixels) as an evaluation value γ.

In the case of the portion of the description 203, the value of a pixeldifference from an image of the preceding frame is zero or small, andthe evaluation value γ becomes larger. On the other hand, in the case ofthe portion of the person 201 who writes a description where the personmoves or gestures, the value of a pixel difference from the image of thepreceding frame is large, and the evaluation value γ becomes smaller.
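The stationary state evaluation can be sketched in the same manner. The following is an illustration under assumptions: frames are lists of pixel values for one group, and a small constant `epsilon` (not part of the embodiment) is added before taking the reciprocal so that a pixel difference of zero does not cause division by zero.

```python
def stationary_state_evaluation(history, epsilon=1.0):
    """Return an illustrative value of gamma for one group of pixels.

    history: frames ordered oldest first, current frame last; each frame is a
             list holding the value of every pixel in the group (assumption).
    epsilon: hypothetical constant that keeps the reciprocal finite.
    """
    pixel_count = len(history[0])
    scores = []
    for i in range(pixel_count):
        # Accumulate the difference from the preceding frame over the window.
        accumulated = sum(
            abs(current[i] - previous[i])
            for previous, current in zip(history, history[1:])
        )
        # The reciprocal makes still pixels score high and moving pixels low.
        scores.append(1.0 / (accumulated + epsilon))
    return sum(scores) / len(scores)


still = stationary_state_evaluation([[40], [40], [40]])
moving = stationary_state_evaluation([[40], [200], [60]])
```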

Next, in step ST5, the index image generation unit 134 substitutes the values α, β, and γ of the respective evaluations obtained in steps ST2 to ST4 in an evaluation function E (α, β, γ) to obtain an evaluation value for determining whether there is the description or not. Then, in step ST5, when the evaluation value is larger than a threshold value THr, the index image generation unit 134 determines that the group has a written portion (the portion of the description 203).
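The embodiment does not specify the form of the evaluation function E. Purely as an illustration, the sketch below assumes a weighted sum of α, β, and γ; the weights and the function names are hypothetical, and any other monotonic combination could be substituted.

```python
def evaluation_function(alpha, beta, gamma, weights=(1.0, 1.0, 1.0)):
    """Hypothetical form of E(alpha, beta, gamma): a weighted sum (assumption)."""
    w_alpha, w_beta, w_gamma = weights
    return w_alpha * alpha + w_beta * beta + w_gamma * gamma


def is_written_portion(alpha, beta, gamma, threshold, weights=(1.0, 1.0, 1.0)):
    """Step ST5: the group is a written portion when E exceeds the threshold THr."""
    return evaluation_function(alpha, beta, gamma, weights) > threshold


# A text-like, long-lived, still group versus a briefly seen, moving group.
description_group = is_written_portion(0.8, 3.0, 1.0, threshold=4.0)
person_group = is_written_portion(0.2, 0.0, 0.01, threshold=4.0)
```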

Next, in step ST6, the index image generation unit 134 holds each pixel included in the group determined as the written portion in step ST5, as a pixel of an index description. Note that the groups determined as the written portion in this way may also include a group including pixels that have been held before as the written portion. Such a group is recognized as having been erased after being written and is not held again as pixels of the index description.

Here, the word “hold” means, for example, to write and hold the value of each pixel included in a group determined as the written portion, to a corresponding pixel position of a frame memory. Here, instead of writing the value itself of each pixel included in the group determined as the written portion, to the corresponding pixel position of the frame memory, only information on the pixel position of each pixel included in the group determined as the written portion may be held.

When holding each pixel included in the group determined as the written portion as a pixel of the index description, the index image generation unit 134 adds, as the timestamp, the value of a time code corresponding to description time, for example, a value of the time code TC at the time point at which the group is determined as the written portion, in association with each pixel (see FIG. 4).
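The holding of pixels together with their timestamps can be sketched as follows. This is an illustration under assumptions: the frame memory is modeled as a dictionary keyed by pixel position, time codes are plain integers, and the class and method names are hypothetical.

```python
class IndexImage:
    """Illustrative frame memory for one item of index image data (assumption)."""

    def __init__(self):
        # (x, y) -> (pixel value, time code value added as the timestamp)
        self.held = {}
        # Time codes at which an erased state was recognized.
        self.erase_events = []

    def hold_group(self, pixels, timecode):
        """Hold one group determined as the written portion.

        pixels: dictionary mapping (x, y) to the pixel value of the group.
        Returns True when the group is held, False when it is treated as erased.
        """
        # A group containing previously held pixels is regarded as an erasure
        # and is not held again.
        if any(position in self.held for position in pixels):
            self.erase_events.append(timecode)
            return False
        for position, value in pixels.items():
            self.held[position] = (value, timecode)
        return True


index_image = IndexImage()
first = index_image.hold_group({(1, 1): 0, (2, 1): 0}, timecode=100)
second = index_image.hold_group({(2, 1): 255}, timecode=200)
```

Keeping the timestamp next to each pixel is what later allows a position specified in the index image to be mapped directly to a reproduction start position.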

Next, in step ST7, the index image generation unit 134 generates a frame image in which the value of each pixel currently held as the index description has a value corresponding to the description 203, and this frame image is set as a new reference frame image. The reference frame image updated in this way is used in the description determination processing for the next frame.

Next, in step ST8, the index image generation unit 134 determines the end of generation of one item of index image data. In this case, when the erased state described above is recognized a certain number of times or more for a fixed time period, the index image generation unit 134 determines that generation of one item of index image data is finished, determines the index image data having been generated up to that point as one item of index image data, and proceeds to the generation of new index image data. The index image generation unit 134 also determines that generation of one item of index image data is finished at the end of recording of moving image data and audio data, and determines the index image data having been generated up to that point as one item of index image data.
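One possible reading of the end determination in step ST8 is a count of erased states inside a sliding time window. The sketch below follows that reading as an assumption; the class name, the parameters `min_count` and `window`, and the use of integer time codes are hypothetical.

```python
from collections import deque


class IndexEndDetector:
    """Illustrative end-of-index decision based on recent erasures (assumption)."""

    def __init__(self, min_count, window):
        self.min_count = min_count  # "a certain number of times"
        self.window = window        # "a fixed time period", in time code units
        self.events = deque()

    def record_erasure(self, timecode):
        """Record one erased state; return True when one index item should end."""
        self.events.append(timecode)
        # Discard erasures that fall outside the fixed time period.
        while self.events and timecode - self.events[0] > self.window:
            self.events.popleft()
        if len(self.events) >= self.min_count:
            self.events.clear()  # start counting afresh for the next index item
            return True
        return False


dense = IndexEndDetector(min_count=3, window=100)
dense_results = [dense.record_erasure(t) for t in (0, 50, 90)]
sparse = IndexEndDetector(min_count=3, window=100)
sparse_results = [sparse.record_erasure(t) for t in (0, 200, 250)]
```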

Returning to FIG. 6, during recording, the recording/reproducing unit 133 adds the time code TC to the encoded image and audio data obtained by the encoding unit 132 as described above and records the data in the recording medium. The recording/reproducing unit 133 also records, in the recording medium, a predetermined number of items of index image data generated by the index image generation unit 134.

Furthermore, in reproduction, the recording/reproducing unit 133 reproduces a predetermined number of items of index image data from the recording medium and transmits the index image data to the display processing unit 136. In addition, in response to an operation from the viewer for specification of a reproduction start position, reproduction instruction, or the like, the recording/reproducing unit 133 reproduces the encoded image and audio data required for image display or audio output, from the recording medium, under the control of the control unit 121 and transmits the data to the decoding unit 135. The decoding unit 135 decodes the encoded image and audio data to obtain image data and audio data.

On the basis of the image data obtained by the decoding unit 135 and the index image data obtained by the recording/reproducing unit 133, and further in response to the operation of the viewer from the user operation unit 122, the display processing unit 136 generates display image data for obtaining the display screen (see FIG. 5), under the control of the control unit 121. The image display unit 137 presents the display screen based on the display image data obtained by the display processing unit 136. Furthermore, the audio output unit 138 outputs audio based on the audio data obtained by the decoding unit 135.

A brief description will be given of the operation of the recording/reproducing device illustrated in FIG. 6. Firstly, the operation during recording will be described. Moving image data and audio data which are obtained by the camera 101 are input to the data input unit 131. The moving image data and audio data are supplied to the encoding unit 132 and encoded. The encoded image and audio data are supplied to the recording/reproducing unit 133 and recorded in the recording medium after a time code TC is added.

Furthermore, the moving image data input from the camera 101 to the data input unit 131 is supplied to the index image generation unit 134. In the index image generation unit 134, the moving image data is processed to determine the portion of a description 203 written into the description portion 202, and index image data is generated to display the portions determined as the written portion as an index description. The index image data is supplied to the recording/reproducing unit 133 and recorded in the recording medium.

Next, the operation during reproduction will be described. In the recording/reproducing unit 133, the index image data is reproduced from the recording medium and supplied to the display processing unit 136. In the display processing unit 136, on the basis of the index image data, display image data is generated to display an index image in the index-image display unit 301 of the display screen. The display image data is supplied to the image display unit 137, and the display screen on which the index image is displayed is presented in the index-image display unit 301.

In this state, when the viewer operates the user operation unit 122 to specify a desired position (two-dimensional coordinate position) of the index description in the index image, the control unit 121 recognizes that a frame having a time code value indicated by a timestamp added to the index image data in association with a pixel corresponding to the specified position is a reproduction start position.
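The recognition of the reproduction start position from a specified position can be sketched as a simple lookup. This is an illustration under assumptions: the timestamps of the index image data are modeled as a dictionary keyed by two-dimensional pixel position, and the function name is hypothetical.

```python
def reproduction_start(timestamps, x, y):
    """Return the time code value for the specified position, if any.

    timestamps: dictionary mapping (x, y) of each pixel of the index
                description to the time code value added as its timestamp.
    Returns None when the position is not on the index description.
    """
    return timestamps.get((x, y))


# Two strokes written at different times (time codes as frame counts).
timestamps = {(10, 5): 1200, (11, 5): 1200, (40, 8): 5400}
start = reproduction_start(timestamps, 40, 8)
miss = reproduction_start(timestamps, 0, 0)
```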

At this time, in the recording/reproducing unit 133, under the control of the control unit 121, the encoded image data of a frame at the reproduction start position and frames prior to and subsequent to the frame are reproduced, and the encoded image data is decoded by the decoding unit 135 to be supplied to the display processing unit 136. In the display processing unit 136, on the basis of the image data of each frame, display image data is generated to display, as a still image, an image at the reproduction start position in the reproduced-image display unit 303 of the display screen, and to display thumbnail images of the frames prior to and subsequent to the reproduction start position in the image list display unit 302.

The display image data is supplied to the image display unit 137, which presents the display screen (see FIG. 5) in which the index image is displayed in the index-image display unit 301, the image at the reproduction start position is displayed in the reproduced-image display unit 303, and the thumbnail images of the frames prior to and subsequent to the reproduction start position are displayed in time series in the image list display unit 302.

In this state, when the viewer operates the user operation unit 122 to select a thumbnail image different from a thumbnail image at the reproduction start position, the control unit 121 recognizes that the reproduction start position is changed to the frame position of the selected thumbnail image. At this time, in the recording/reproducing unit 133, under the control of the control unit 121, the encoded image data of a frame at the changed reproduction start position and frames prior to and subsequent to the frame are reproduced, and the encoded image data is decoded by the decoding unit 135 to be supplied to the display processing unit 136. Thus, in the display screen presented on the image display unit 137, image display in the reproduced-image display unit 303 and the image list display unit 302 corresponds to the changed reproduction start position.
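The selection of frames for the image list display unit 302 can be sketched as follows. The number of frames shown before and after the reproduction start position is not given in the embodiment, so the defaults below are hypothetical, as is the function name.

```python
def thumbnail_frames(start_frame, total_frames, before=2, after=2):
    """Return the frame positions to show as thumbnails, in time series.

    The range is clipped at the beginning and end of the recording.
    """
    first = max(0, start_frame - before)
    last = min(total_frames - 1, start_frame + after)
    return list(range(first, last + 1))


around_start = thumbnail_frames(10, 100)
# Selecting the thumbnail of frame 12 changes the reproduction start position,
# and the list is rebuilt around the new position.
after_change = thumbnail_frames(12, 100)
```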

When, while the image at the reproduction start position is displayed as a still image in the reproduced-image display unit 303, the viewer operates the user operation unit 122 to instruct reproduction, the recording/reproducing unit 133 reproduces, under the control of the control unit 121, encoded image and audio data from the reproduction start position, the encoded image and audio data are decoded by the decoding unit 135, and moving image data and audio data are obtained.

The moving image data obtained by the decoding unit 135 is supplied to the display processing unit 136. In the display processing unit 136, on the basis of the moving image data, display image data is generated to display a moving image in the reproduced-image display unit 303 of the display screen. The display image data is supplied to the image display unit 137, and the display screen on which the moving image is displayed is presented in the reproduced-image display unit 303. Note that, in this case, the moving image may be displayed all over the display screen presented on the image display unit 137.

Furthermore, the audio data obtained by the decoding unit 135 is supplied to the audio output unit 138. The audio output unit 138 is configured to provide an audio output corresponding to the moving image data displayed on the image display unit 137.

Note that, when reproduction from another reproduction start position is desired, the viewer preferably selects a desired index image in the index-image display unit 301 on the display screen presented on the image display unit 137 and performs an operation to specify a desired position of the index description in the selected index image.

As described above, in a recording/reproducing system 10 illustrated in FIG. 1, the recording/reproducing device 102 is configured to record, upon recording, index image data generated to display, as an index description, each portion determined as a written portion by processing moving image data, and further the recording/reproducing device 102 is configured to add, as a timestamp, a value of a time code corresponding to description time in association with each pixel constituting the index description to the index image data. Thus, upon reproduction, the index image data is used to display an index image having an index description, and a viewer is able to specify an appropriate position on the basis of the content of an index description to readily reach an image/audio reproduction start position corresponding to description time corresponding to the specified position.

Furthermore, in the recording/reproducing system 10 illustrated in FIG. 1, the recording/reproducing device 102 is configured so that, upon reproduction, an index image having an index description is displayed on the basis of index image data and an appropriate position of the index description is specified, and the reproduction start positions in moving image data and audio data are controlled on the basis of a timestamp added to the index image data in association with a pixel corresponding to the specified position. Therefore, the viewer can specify an appropriate position on the basis of the content of an index description to readily and accurately reach an image/audio reproduction start position corresponding to description time corresponding to the specified position.

Furthermore, in the recording/reproducing system 10 illustrated in FIG. 1, the recording/reproducing device 102 is configured so that, upon reproduction, an image at a reproduction start position is displayed as a still image in the reproduced-image display unit 303 before the reproduction is started, and in response to a reproduction instruction from the viewer, the reproduction of moving image data and audio data is started from the reproduction start position. Thus, the viewer is able to issue a reproduction instruction after confirming the image at the reproduction start position.

Furthermore, in the recording/reproducing system 10 illustrated in FIG. 1, the recording/reproducing device 102 is configured so that, upon reproduction, while an image at a reproduction start position is displayed as a still image in the reproduced-image display unit 303, thumbnail images of frames within a predetermined range including the reproduction start position are displayed in the image list display unit 302, and when the viewer selects a predetermined thumbnail image, the reproduction start position is changed to the position of a frame of the selected thumbnail image. Thus, the viewer can easily and appropriately change the reproduction start position with reference to the thumbnail images.

For example, upon recording sounds and images of lectures, it is expected that the amount of image/audio recorded will continue to increase as the number of lectures increases, and total recording time will be enormous. Upon browsing, it is impossible to review the whole of this huge amount of recorded image/audio, so accurate and quick browsing of a necessary portion is required.

In the present technology, the contents of descriptions written by the person who writes a description are projected in the index images, and thus, viewing the index images enables the viewer to grasp an outline of the whole image. Furthermore, on the basis of the content of a description showing an item for which details in the index image are desired to be browsed, the image/audio of the desired content can be immediately browsed. The image/audio may help the viewer to understand the lecture, leading to understanding of the content in a shorter time.

Furthermore, for example, upon recording a meeting, the record helps quick browsing of a portion of the content of the meeting that is required later, increasing efficiency and reducing man-hours even in a business situation.

2. MODIFICATIONS

Note that, in the above embodiment, it has been described that, to the index image data, the value of a time code corresponding to description time is added as a timestamp, in association with each pixel constituting an index description. Character recognition may also be performed on the index description included in the index image to recognize a word, and further, data (text data) on the recognized word may be added to the index image data, in association with each pixel constituting the index description corresponding to the word.

FIG. 9 illustrates a configuration example of a recording/reproducing device 102A where character recognition is performed on an index description included in an index image. The recording/reproducing device 102A is configured similarly to the recording/reproducing device 102 illustrated in FIG. 6 and further includes a character recognition unit 139.

Under the control of the control unit 121, the character recognition unit 139 reads index image data recorded in the recording medium of the recording/reproducing unit 133 and, on the basis of the index image data, performs character recognition processing on an index description included in an index image to recognize a word included in the index description. Furthermore, in the character recognition unit 139, data (text data) on the recognized word is further added to the index image data, in association with each pixel constituting the index description corresponding to the word. The index image data to which the data on the word is added in this way is supplied to the recording/reproducing unit 133 and recorded in the recording medium again, and the index image data is held.

Adding the data on the word included in the index description to the index image data makes it possible, for example, to continuously reproduce all images corresponding to appearance time with a word having high appearance frequency as a trigger, or to display the meaning of the word or a related image as reference information. This reference information is acquired, for example, from a server on the Internet by the communication interface 123.

FIG. 10 illustrates an example of display of reference information. The illustrated example shows that the viewer operates the user operation unit to specify a position (two-dimensional coordinate position) of an index description in the index image of “Index 00”, indicated by the mouse pointer 304. In this case, in the reproduced-image display unit 303, a frame image having a time code value indicated by a timestamp added to the index image data in association with a pixel corresponding to the specified position is displayed, as a still image, for an image at the reproduction start position.

Furthermore, the illustrated example shows that data on the word “ABCD” is added in association with a pixel corresponding to the specified position. Therefore, reference information relating to the word “ABCD” is acquired, and a reference information image 305 is displayed so as to be superimposed on the reproduced-image display unit 303. Note that, in the illustrated example, the viewer specifies the portion of the character “D”, but, for example, even when the viewer specifies any of characters “A” to “C”, data of the word “ABCD” is added to the index image data, in association with the characters, and the reference information is displayed in a similar manner.
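The association of word data with each pixel can be sketched as follows. This is an illustration under assumptions: the output of character recognition is modeled as a dictionary from each recognized character to the pixel positions it occupies, and the function name and data layout are hypothetical.

```python
def attach_word(word_map, word, pixels_by_character):
    """Add data on a recognized word to every pixel of its characters.

    word_map: dictionary mapping (x, y) to the text data of the word.
    pixels_by_character: dictionary mapping each character of the word to the
                         list of pixel positions forming it (assumption).
    """
    for pixels in pixels_by_character.values():
        for position in pixels:
            word_map[position] = word


word_map = {}
attach_word(
    word_map,
    "ABCD",
    {"A": [(0, 0)], "B": [(1, 0)], "C": [(2, 0)], "D": [(3, 0), (3, 1)]},
)
# Specifying any character of the word yields the same word data.
on_d = word_map.get((3, 1))
on_a = word_map.get((0, 0))
```

Because every pixel of the word carries the whole word, the reference information for “ABCD” can be requested regardless of which character the viewer points at.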

Furthermore, for example, the addition of the data of a word included in the index description to the index image data may make it possible to perform detection within an index image or between index images for specifying a portion of a recognized word in a recorded moving image. In this case, in the image list display unit 302 of FIG. 5, not the time-series thumbnail images but a thumbnail image of a frame at the reproduction start position corresponding to a position specified by the viewer and a thumbnail image of another frame in which the same word as the word at the specified position appears are displayed side by side. Then, when a desired thumbnail image is selected by the viewer, the reproduction start position is changed to the position of the frame of the selected thumbnail image.

Note that, as described above, in a case where data (text data) on a recognized word is added to the index image data, in association with each pixel constituting the index description corresponding to the word, the value of a time code TC corresponding to one or more other frames where the same word appears may also be added as the timestamp.
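The search for other frames in which the same word appears can be sketched as follows. This is an illustration under assumptions: the word data and the timestamps are both modeled as dictionaries keyed by pixel position, and the function name is hypothetical.

```python
def same_word_timecodes(word_map, timestamps, x, y):
    """Return the time code values of every occurrence of the specified word.

    word_map: dictionary mapping (x, y) to the text data of the word.
    timestamps: dictionary mapping (x, y) to the time code value.
    Returns an empty list when no word data is added at the position.
    """
    word = word_map.get((x, y))
    if word is None:
        return []
    codes = {
        timestamps[position]
        for position, other in word_map.items()
        if other == word and position in timestamps
    }
    return sorted(codes)


words = {(0, 0): "ABCD", (1, 0): "ABCD", (5, 5): "ABCD", (7, 7): "XYZ"}
codes = {(0, 0): 100, (1, 0): 100, (5, 5): 900, (7, 7): 400}
candidates = same_word_timecodes(words, codes, 1, 0)
```

Each returned time code value corresponds to one thumbnail image that could be displayed side by side in the image list display unit 302.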

Furthermore, in the above embodiment, the moving image data obtained by imaging a state in which the person 201 who writes a description is explaining while writing a description 203 in the description portion 202, such as a white board or a blackboard, is processed, but other similar moving image data may also be processed. For example, moving image data obtained by imaging a state in which the person who writes a description is explaining while writing a description on paper on a desk may be processed in a similar manner to facilitate reproduction.

In addition, a description on the description portion 202 is not limited to only characters and may be a figure or a mark, and the description is not limited to a written description but may be displays placed on the description portion 202.

Furthermore, the present technology may also include the following configurations.

(1) A recording device including:

an image/audio recording unit for adding a time code to moving image data obtained by imaging a state in which a person who writes a description is explaining while writing a description in a description portion, and audio data corresponding to the moving image data to record the data in a recording unit; and

an index-image recording unit for processing the moving image data, determining a written portion in the description portion, generating index image data for displaying, as an index description, portions determined as the written portion, and recording the index image data in a recording unit,

in which, to the index image data, a value of the time code corresponding to description time is added as a timestamp, in association with pixels constituting the index description.

(2) The recording device according to (1), in which

the index-image recording unit

extracts a difference value between a current frame image and a reference frame image for each pixel, grasps groups of a series of pixels having a difference value equal to or more than a threshold value, and determines whether each group has the written portion or not.

(3) The recording device according to (2), in which

the index-image recording unit determines whether each of the groups has the written portion or not by using a shape evaluation value, extraction time evaluation value, and stationary time evaluation value, and

the index-image recording unit

determines the shape evaluation value on the basis of a relationship between the aspect ratios of a minimum square containing the group and a maximum square contained in the group,

obtains the extraction time evaluation value on the basis of the value of a difference from the reference frame image, in a pixel included in the group, over a current frame and a plurality of frames prior to the current frame, and

obtains the stationary time evaluation value on the basis of the value of a difference from a frame image of a preceding frame, in a pixel included in the group, over a current frame and a plurality of frames prior to the current frame.

(4) The recording device according to any of (1) to (3), in which

the index-image recording unit,

when an erased state of a written portion in the description portion is recognized more than a certain number of times for a fixed time period, determines the index image data having been generated up to that point as one item of index image data and shifts to a process of generating new index image data.

(5) The recording device according to any of (1) to (4), further including

a character recognition unit to perform character recognition on an index description to recognize a word,

in which the index image data includes data on this word in association with each pixel constituting the index description corresponding to the recognized word.

(6) A recording method including:

an image/audio recording step of adding a time code to moving image data obtained by imaging a state in which a person who writes a description is explaining while writing a description in a description portion, and audio data corresponding to the moving image data to record the data in a recording unit, by using an image/audio recording unit; and

an index-image recording step of processing the moving image data, determining a written portion in the description portion, generating index image data for displaying, as an index description, portions determined as the written portion, and recording the index image data in a recording unit, by using an index-image recording unit,

in which, to the index image data, a value of the time code corresponding to description time is added as a timestamp, in association with pixels constituting the index description.

(7) A reproducing device including

an index-image display control unit for reading index image data from a recording unit to display, on an index-image display unit, an index image having an index description, on a basis of the index image data, the index image data including a value of a time code, added as a timestamp, corresponding to description time in association with each pixel constituting the index description;

a user operation unit for specifying an appropriate position of an index description of an index image displayed on the display unit; and

a reproduction control unit for controlling a reproduction start position in moving image data and audio data recorded in a recording unit on a basis of the timestamp added to the index image data in association with a pixel corresponding to the specified position.

(8) The reproducing device according to (7), in which

the reproduction control unit

displays, as a still image, an image at the reproduction start position on a reproduced-image display unit, before starting reproduction.

(9) The reproducing device according to (8), in which

the reproduction control unit, when receiving a reproduction instruction from the user operation unit while an image corresponding to a reproduction start position is displayed as a still image on the reproduced-image display unit, starts reproduction of the moving image data and the audio data, from the reproduction start position.

(10) The reproducing device according to (8) or (9), in which

the reproduction control unit

displays, on an image list display unit, thumbnail images of frames within a predetermined range including the reproduction start position while an image at a reproduction start position is displayed as a still image on the reproduced-image display unit.

(11) The reproducing device according to (10), in which

the reproduction control unit,

when a predetermined thumbnail image is selected from the user operation unit, changes a reproduction start position to a position of a frame of this thumbnail image.

(12) The reproducing device according to any of (8) to (11), in which

in association with each pixel constituting an index description corresponding to a word recognized using character recognition, data on the word is added to the index image data,

the reproducing device further includes an information acquisition unit for acquiring, in association with a pixel corresponding to a position designated by the user operation unit, on the basis of data on a word added to the index image data, information relating to the word, and

the reproduction control unit

displays the acquired information over the image at the reproduction start position.

(13) The reproducing device according to any of (8) to (12), in which

in association with each pixel constituting an index description corresponding to a word recognized using character recognition, data on the word is added to the index image data, and

the reproduction control unit,

together with a thumbnail image of a frame corresponding to the reproduction start position, while an image at a reproduction start position is displayed as a still image on the image display unit, displays, in an image list display unit, a thumbnail image at a position of a timestamp added in association with a pixel corresponding to another position with which data on a word the same as data on a word added in association with a pixel corresponding to the specified position is added in association.

(14) The reproducing device according to (13), in which

the reproduction control unit,

when a predetermined thumbnail image is selected from the user operation unit, changes a reproduction start position to a position of a frame of this thumbnail image.

(15) A reproducing method including:

an index-image display control step of reading index image data from a recording unit to display, on a display unit, an index image having an index description, on the basis of the index image data, by using an index-image display control unit, the index image data including a value of a time code, added as a timestamp, corresponding to description time in association with each pixel constituting the index description; and

a reproduction control step of controlling a reproduction start position in moving image data and audio data recorded in a recording unit on the basis of a timestamp added to the index image data in association with a pixel corresponding to the specified position, in response to specification of an appropriate position of an index description of an index image displayed on the display unit, by using a reproduction control unit.

(16) A recording/reproducing device including:

an image/audio recording unit for adding a time code to moving image data obtained by imaging a state in which a person who writes a description is explaining while writing a description in a description portion, and audio data corresponding to the moving image data to record the data in a first recording unit;

an index-image recording unit for processing the moving image data, determining a written portion in the description portion, generating index image data for displaying, as an index description, portions determined as the written portion, and recording the index image data in a second recording unit, the index image data including a value of the time code, added as a timestamp, corresponding to description time in association with each pixel constituting the index description;

an index-image display control unit for reading index image data from the second recording unit to display, on a display unit, an index image having an index description, on the basis of the index image data;

a user operation unit for specifying an appropriate position of an index description of an index image displayed on the display unit; and

a reproduction control unit for controlling a reproduction start position in moving image data and audio data recorded in the first recording unit on the basis of the timestamp added to the index image data in association with a pixel corresponding to the specified position.

REFERENCE SIGNS LIST

-   10 Recording/reproducing system
-   101 Camera
-   102, 102A Recording/reproducing device
-   121 Control unit
-   122 User operation unit
-   123 Communication interface
-   131 Data input unit
-   132 Encoding unit
-   133 Recording/reproducing unit
-   134 Index image generation unit
-   135 Decoding unit
-   136 Display processing unit
-   137 Image display unit
-   138 Audio output unit
-   139 Character recognition unit
-   301 Index-image display unit
-   302 Image list display unit
-   303 Reproduced-image display unit
-   304 Mouse pointer
-   305 Reference information image

1. An information processing apparatus comprising: circuitry configuredto identify a written portion on a description portion in captured imagedata, generate index image data as an index description based on theidentified written portion, and add a time code corresponding to adescription time in the captured image data to the index image data as atimestamp.
 2. The information processing apparatus according to claim 1,wherein the circuitry is further configured to add the time code inassociation with pixels included in the index description.
 3. Theinformation processing apparatus according to claim 1, wherein thecircuitry is further configured to record the index image data in arecorder.
 4. The information processing apparatus according to claim 1,wherein to identify the written portion, the circuitry is configured to:extract a difference value between a current frame of the captured imagedate and a reference frame, select groups of series of pixels havingdifferent values equal to or more than a threshold value, and determinewhether each of the selected groups includes the written portion.
5. The information processing apparatus according to claim 4, wherein to determine whether each of the selected groups includes the written portion, the circuitry is configured to use at least one of a shape evaluation value, an extraction time value, and a stationary time evaluation value.
6. The information processing apparatus according to claim 5, wherein the circuitry is further configured to determine the shape evaluation value based on a relationship between aspect ratios of a minimum square including a respective one of the selected groups and a maximum square included in the respective one of the selected groups.
7. The information processing apparatus according to claim 5, wherein the circuitry is further configured to obtain the extraction time value based on a value of a difference from the reference frame image, in a pixel included in a respective one of the selected groups, over a current frame and a plurality of frames prior to the current frame.
8. The information processing apparatus according to claim 5, wherein the circuitry is further configured to obtain the stationary time evaluation value based on a value of a difference from a frame image of a preceding frame, in a pixel of a respective one of the selected groups, over a current frame and a plurality of frames prior to the current frame.
9. The information processing apparatus according to claim 1, wherein the circuitry is further configured to, when an erased state of the written portion is recognized more than a predetermined number of times, determine that the index image data is generated up to that point as one item of index image data and shift to a process of generating new index image data.
10. The information processing apparatus according to claim 1, wherein the circuitry is further configured to: perform character recognition on the index description to recognize a word, wherein the index image data includes data on the recognized word in association with each pixel included in the index description corresponding to the recognized word.
11. An information processing method, comprising: identifying, with circuitry, a written portion on a description portion in captured image data, generating, with the circuitry, index image data as an index description based on the identified written portion, and adding, with the circuitry, a time code corresponding to a description time in the captured image data to the index image data as a timestamp.
12. The information processing method according to claim 11, further comprising adding the time code in association with pixels included in the index description.
13. The information processing method according to claim 11, further comprising recording the index image data in a recorder.
14. The information processing method according to claim 11, further comprising: extracting a difference value between a current frame of the captured image data and a reference frame, selecting groups of series of pixels having difference values equal to or more than a threshold value, and determining whether each of the selected groups includes the written portion.
15. The information processing method according to claim 14, wherein, to determine whether each of the selected groups includes the written portion, at least one of a shape evaluation value, an extraction time value, and a stationary time evaluation value is used.
16. The information processing method according to claim 15, further comprising determining the shape evaluation value based on a relationship between aspect ratios of a minimum square including a respective one of the selected groups and a maximum square included in the respective one of the selected groups.
17. The information processing method according to claim 15, further comprising obtaining the extraction time value based on a value of a difference from the reference frame image, in a pixel included in a respective one of the selected groups, over a current frame and a plurality of frames prior to the current frame.
18. The information processing method according to claim 15, further comprising obtaining the stationary time evaluation value based on a value of a difference from a frame image of a preceding frame, in a pixel of a respective one of the selected groups, over a current frame and a plurality of frames prior to the current frame.
19. The information processing method according to claim 11, further comprising, when an erased state of the written portion is recognized more than a predetermined number of times, determining that the index image data is generated up to that point as one item of index image data and shifting to a process of generating new index image data.
20. A non-transitory computer-readable medium encoded with computer-readable instructions that, when executed by a computer, cause the computer to perform a method comprising: identifying a written portion on a description portion in captured image data, generating index image data as an index description based on the identified written portion, and adding a time code corresponding to a description time in the captured image data to the index image data as a timestamp.
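
As an informal illustration only, the identification and timestamping steps recited in claims 1, 2, and 4 might be sketched as below. This is not the claimed implementation. It assumes grayscale frames stored as nested lists, 4-connected pixel grouping, and a default threshold of 50, none of which are specified in the text. The helper names (`diff_mask`, `pixel_groups`, `update_index`) are hypothetical, and the `is_written` callback is a placeholder for the shape, extraction time, and stationary time evaluations of claims 5 to 8, which are not implemented here.

```python
from collections import deque


def diff_mask(current, reference, threshold):
    """Mark pixels whose absolute difference from the reference frame
    is equal to or more than the threshold."""
    return [[abs(c - r) >= threshold for c, r in zip(crow, rrow)]
            for crow, rrow in zip(current, reference)]


def pixel_groups(mask):
    """Collect 4-connected groups of marked pixels as lists of (x, y).
    4-connectivity is an assumption made for this sketch."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    groups = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, group = deque([(x, y)]), []
            while queue:  # breadth-first flood fill from the seed pixel
                cx, cy = queue.popleft()
                group.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            groups.append(group)
    return groups


def update_index(ts_map, current, reference, time_code,
                 threshold=50, is_written=lambda group: True):
    """Stamp newly written pixels in ts_map with the current time code.

    ts_map[y][x] holds None until the pixel is first judged as written,
    so an earlier timestamp is never overwritten by a later frame.
    """
    for group in pixel_groups(diff_mask(current, reference, threshold)):
        if not is_written(group):  # placeholder for the claim 5-8 checks
            continue
        for x, y in group:
            if ts_map[y][x] is None:
                ts_map[y][x] = time_code
    return ts_map
```

Calling `update_index` once per processed frame, with the time code of that frame, yields a per-pixel timestamp map of the kind a reproducing side could consult when a position on the index image is specified.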